L01, Week 1, Tom: Introduction
  Course overview (use the forum, use Jupyter, the material is hard from here on, read papers)
  Problem examples
  Revise model vs inference.
  Bayesian revision.



L02, Week 1, Tom: Density estimation
  Introduce density estimation.
  Histograms (you all do this...) - the requirements for being a valid pmf/pdf.
  Triangular kernels (was dropped).
  Variable width bins.
  Kernel Density Estimates
  Mean Shift - segmentation example. (was dropped)

  Maximum likelihood Gaussian fitting (redux)
  Gaussian Mixture Model (redux)

  Above but Bayesian. (was dropped)
  Discuss choosing the number of mixture components; BIC.

  Using for classification/regression via Bayes rule.
  Abnormality Detection - background subtraction example?
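The kernel density estimate bullet could be backed by a minimal in-lecture demo; a sketch assuming 1D Gaussian kernels (the name `gaussian_kde_1d` and the bandwidth `h` are illustrative, not from the course materials):

```python
import math

def gaussian_kde_1d(data, h):
    """Kernel density estimate: an average of Gaussian bumps centred
    on the data points. Each kernel integrates to one, so the average
    is itself a valid pdf."""
    n = len(data)
    norm = n * h * math.sqrt(2.0 * math.pi)
    def pdf(x):
        return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / norm
    return pdf

pdf = gaussian_kde_1d([0.0, 1.0, 2.0], h=0.5)
# The estimate is high near the data and decays smoothly away from it.
```

Shrinking `h` makes the estimate spikier, growing it smooths everything into one bump - the same bias/variance trade-off as histogram bin width.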



L03, Week 2, Tom: Incremental learning
  Motivation

  Batch approach (can use previous state as initialisation).
  Fast+slow models.
  Talk about engineering vs. buying lots of computers.
 
  Incremental mean, variance, median, percentile.

  Inc. linear regression.

  Conjugate prior example.

  Inc. Random forest.
  Incremental GMM (Fisher).
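The incremental mean/variance bullet is commonly taught via Welford's one-pass update; a minimal sketch (the class name is illustrative):

```python
class RunningStats:
    """Welford's online algorithm: mean and variance in a single pass,
    O(1) memory, numerically stable."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def push(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        """Unbiased sample variance."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

rs = RunningStats()
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    rs.push(x)
# rs.mean -> 5.0, rs.variance() -> 32/7
```

Incremental median and percentiles need more machinery (e.g. P2 or reservoir-style approaches), but this shows the flavour of never storing the data.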



L04, Week 2, Tom: Active learning
  Active Learning concept.
  Simple clustering approach.
  Various metrics with demo.

  More advanced metrics, talk about time.

  Rare-class active learning with a DP approach.
  Variants.
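The "various metrics" demo could open with the simplest query strategy, uncertainty sampling by predictive entropy; a sketch (function names are illustrative):

```python
import math

def entropy(probs):
    """Shannon entropy (nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def query_most_uncertain(pool_probs):
    """Return the index of the unlabelled pool item whose predicted
    class distribution has the highest entropy."""
    return max(range(len(pool_probs)), key=lambda i: entropy(pool_probs[i]))

idx = query_most_uncertain([[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]])
# idx == 1: the 50/50 prediction is maximally uncertain, so label it next
```

Other metrics (margin, least confidence, expected model change) slot in by swapping the scoring function.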



L05, Week 3, Tom: Monte Carlo 
  Concept of Monte Carlo, motivation.
  CDF transform for drawing from 1D distributions.
  Numerical integration
  Calculating pi, Gantt chart probabilities.
  Rejection sampling, pi again?

  Importance sampling. (mention umbrella sampling as Physics equivalent)
  Normalisation not necessary!
  Big example of importance sampling?
  Mention adaptive sampling. Maybe demo?

  Stratified sampling.
  Slice sampling (to lead them into thinking about multiple steps)
  Example of above.
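The CDF-transform and pi bullets are each a few lines to demo; a sketch in pure stdlib Python (the fixed seed is just for reproducibility):

```python
import math
import random

random.seed(0)

def sample_exponential(rate):
    """CDF (inverse-transform) sampling: if U ~ Uniform(0, 1) then
    F^{-1}(U) = -ln(1 - U) / rate is distributed Exponential(rate)."""
    return -math.log(1.0 - random.random()) / rate

def estimate_pi(n):
    """Monte Carlo pi: 4 times the fraction of uniform points landing
    inside the quarter unit circle."""
    inside = sum(1 for _ in range(n)
                 if random.random() ** 2 + random.random() ** 2 < 1.0)
    return 4.0 * inside / n

pi_hat = estimate_pi(100_000)  # close to math.pi; error shrinks as 1/sqrt(n)
```

The 1/sqrt(n) error rate is the usual hook for motivating importance and stratified sampling later in the lecture.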



L06, Week 3, Tom: Markov chain Monte Carlo
  Concept - multiple steps.
  Detailed balance
  Metropolis-Hastings, inc. reject/accept rate discussion.
  Hamiltonian MC if there is time.

  Examples? Definitely something Bayesian.
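A minimal random-walk Metropolis sampler could anchor the accept/reject-rate discussion; a sketch targeting a standard normal (the step size and seed are illustrative):

```python
import math
import random

random.seed(1)

def metropolis(log_p, x0, step, n):
    """Random-walk Metropolis: symmetric Gaussian proposals, so the
    Hastings correction cancels and acceptance depends only on the
    (unnormalised) target ratio."""
    x, accepted, samples = x0, 0, []
    for _ in range(n):
        prop = x + random.gauss(0.0, step)
        delta = log_p(prop) - log_p(x)
        if delta >= 0.0 or random.random() < math.exp(delta):
            x, accepted = prop, accepted + 1
        samples.append(x)
    return samples, accepted / n

# Target: standard normal; log-density needed only up to a constant.
samples, rate = metropolis(lambda x: -0.5 * x * x, 0.0, 1.0, 20000)
```

Printing `rate` while varying `step` demonstrates the trade-off: tiny steps accept almost everything but barely move, huge steps are almost always rejected.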



L07, Week 4, Tom: Gibbs sampling
  Curse of dimensionality as applied to Metropolis-Hastings and Hamiltonian MC - low acceptance rates.
  Do a few dimensions at a time.
  This on a graphical model = Gibbs sampling.
  Toy problem.
  Topic model (Latent Dirichlet allocation) inc. integrating out.
  Discuss burn-in, summarising, 'convergence'.
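A bivariate Gaussian makes a good toy Gibbs problem before LDA, since each full conditional is itself a Gaussian; a sketch (the correlation `rho` and seed are illustrative):

```python
import math
import random

random.seed(2)

def gibbs_bivariate_normal(rho, n):
    """Gibbs sampling on a 2D standard normal with correlation rho:
    alternately redraw each coordinate from its full conditional,
    which is N(rho * other, 1 - rho^2)."""
    x = y = 0.0
    s = math.sqrt(1.0 - rho * rho)
    samples = []
    for _ in range(n):
        x = random.gauss(rho * y, s)
        y = random.gauss(rho * x, s)
        samples.append((x, y))
    return samples

samples = gibbs_bivariate_normal(0.8, 20000)
```

High `rho` makes the conditionals narrow and the chain mix slowly - a concrete way into the burn-in and convergence discussion.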



L08, Week 4, Tom: Dirichlet processes
  Parametric vs non-parametric - make sure they are clear on the definitions!
  Dirichlet process.
  Summarise measure theory.
  Make clear how it's different - it is a distribution over distributions, not over a distribution's parameters.
  It turns a continuous base distribution into a discrete one.
  
  Chinese restaurant process
  DP-GMM (CRP only)
   
  Stick breaking construction - just mention.
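The Chinese restaurant process is a few lines to simulate: each customer joins an existing table with probability proportional to its occupancy, or opens a new one with probability proportional to alpha; a sketch (names are illustrative):

```python
import random
from collections import Counter

random.seed(3)

def chinese_restaurant_process(n, alpha):
    """Seat n customers; return each customer's table index."""
    tables = []       # tables[t] = number of customers at table t
    assignments = []
    for i in range(n):
        # Total unnormalised mass: i existing customers plus alpha for a new table.
        r = random.uniform(0.0, i + alpha)
        acc = 0.0
        for t, count in enumerate(tables):
            acc += count
            if r < acc:
                tables[t] += 1
                assignments.append(t)
                break
        else:
            tables.append(1)               # open a new table
            assignments.append(len(tables) - 1)
    return assignments

seating = chinese_restaurant_process(100, alpha=1.0)
```

The rich-get-richer effect is visible in `Counter(seating)`: a few big tables and a long tail of small ones, with the expected table count growing only logarithmically in n.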



L09, Week 5, Ken: Hidden Markov models
  Concept
  Kalman filter (kind of a redux)
  Tracking example.

  Particle filters
  Better tracking example.
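The Kalman filter redux could open with the scalar case before the tracking example; a sketch of a 1D random-walk filter (the process noise `q`, measurement noise `r`, and the data are illustrative):

```python
def kalman_1d(zs, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state model.

    Predict: the state estimate is unchanged, its variance grows by q.
    Correct: blend in the measurement with gain k = p / (p + r).
    """
    x, p = x0, p0
    estimates = []
    for z in zs:
        p = p + q                # predict step
        k = p / (p + r)          # Kalman gain
        x = x + k * (z - x)      # correct towards the measurement
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates

est = kalman_1d([1.1, 0.9, 1.0, 1.2, 0.8], q=0.01, r=0.5)
```

As `p` shrinks over time the gain drops, so later measurements move the estimate less - the filter has "learnt" the state.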



L10, Week 5, Ken: Hyperparameter optimisation
  Find out who has done the manual approach.
  Relationship to optimisation: Slow evaluation and no gradients.
  Grid search and random search over parameters - ideas from: "Random Search for Hyper-Parameter Optimization".
  Line search
  Genetic algorithms. Talk about limitations.
  Could include particle swarm optimisation, time permitting?
  An overview of Bayesian optimisation might work as a good lead-in to the topics below.
  Thompson sampling.
  A/B testing using Thompson sampling.
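Thompson sampling for Bernoulli rewards (the A/B-testing case) fits in a few lines with Beta posteriors; a sketch (the arm rates, horizon, and seed are illustrative):

```python
import random

random.seed(4)

def thompson_bernoulli(true_rates, n_rounds):
    """Beta-Bernoulli Thompson sampling: sample a success rate from each
    arm's Beta posterior, pull the argmax, update that arm's counts."""
    wins = [1] * len(true_rates)     # Beta(1, 1) uniform priors
    losses = [1] * len(true_rates)
    pulls = [0] * len(true_rates)
    for _ in range(n_rounds):
        draws = [random.betavariate(w, l) for w, l in zip(wins, losses)]
        arm = draws.index(max(draws))
        pulls[arm] += 1
        if random.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return pulls

pulls = thompson_bernoulli([0.2, 0.5, 0.8], 2000)
```

The pull counts concentrate on the best arm while still occasionally exploring the others - exploration falls out of posterior uncertainty rather than an explicit epsilon.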



L11, Week 6, Ken: Bayesian quadrature
  Gaussian process (redux)
  Bayesian quadrature
  Back to Thompson sampling.
  Loop back to hyperparameter optimisation, using above.



L12, Week 6, Ken: Ensembles
  Define ensemble.
  Bagging redux, but with theory.
  Bayesian bootstrap.
  Boosting.
  Viola-Jones face detector.
  Committees of experts.
  Bell curves (Gaussian, Logistic, Cauchy) and how evidence from committee members combines for each choice.
  An example would be good, probably small?
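The bagging redux could use a small runnable demo: bootstrap-resampled decision stumps combined by majority vote; a sketch on 1D toy data (all names and the data are illustrative):

```python
import random

random.seed(5)

def bootstrap_sample(data):
    """Resample with replacement - the 'bootstrap' half of bagging."""
    return [random.choice(data) for _ in data]

def fit_stump(points):
    """Threshold classifier: split at the midpoint of the two class means.
    Fallback means guard against a one-class bootstrap replicate."""
    xs0 = [x for x, y in points if y == 0] or [0.0]
    xs1 = [x for x, y in points if y == 1] or [1.0]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2.0

def bagged_predict(thresholds, x):
    """Majority vote over the ensemble - the 'aggregating' half."""
    votes = sum(1 for t in thresholds if x > t)
    return 1 if 2 * votes > len(thresholds) else 0

# Class 0 lives on [0, 1), class 1 on [1, 2).
data = ([(i / 10.0, 0) for i in range(10)]
        + [(i / 10.0, 1) for i in range(10, 20)])
stumps = [fit_stump(bootstrap_sample(data)) for _ in range(25)]
```

Each stump sees a slightly different replicate, so the thresholds jitter; averaging the votes reduces the variance, which is the theoretical point of the lecture.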



L13, Week 7, Ken: Neural networks
  Just an intro for those who are not doing Kwang's unit.
  Would be nice to squeeze in weakly supervised learning, but not sure how.



L14, Week 7, Ken: Sparsity
  Sparsity
  Compressed sensing
  Differences between L1, L2 etc. Show contours and emphasise parameters going to zero as the advantage of L1.
  Missing data.
  Differences between noise/missing data/outliers.
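The contour picture has an algebraic counterpart: the proximal (shrinkage) operators of the L1 and L2 penalties. A sketch showing that L1 shrinkage sets small coefficients exactly to zero while L2 shrinkage never does (function names are illustrative):

```python
def soft_threshold(x, lam):
    """Proximal operator of the L1 penalty: shift towards zero and clip.
    Anything with |x| <= lam lands exactly at zero - hence sparsity."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def ridge_shrink(x, lam):
    """L2 counterpart: rescale towards zero, but never reach it."""
    return x / (1.0 + lam)
```

This one-coefficient view is why lasso-style solvers produce exact zeros and ridge-style solvers only small values.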



<2 Week Easter Break>



L15, Week 8, Tom: Expectation maximisation
  Redux: more theoretical, and as setup for the next lecture.
  Revise EM GMM.
  Discuss from a lower bound POV.
  A second example?
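The EM-GMM revision could be anchored by a compact 1D implementation; a sketch for two components with fixed unit variances (the data, seed, and names are illustrative):

```python
import math
import random

random.seed(6)

def em_gmm_1d(data, n_iter=50):
    """EM for a two-component 1D GMM with fixed unit variances.

    E-step: responsibilities of each component for each point.
    M-step: responsibility-weighted means and mixing weights.
    """
    mu = [min(data), max(data)]   # crude but serviceable initialisation
    w = [0.5, 0.5]
    for _ in range(n_iter):
        resp = []
        for x in data:            # E-step
            p = [w[k] * math.exp(-0.5 * (x - mu[k]) ** 2) for k in (0, 1)]
            s = p[0] + p[1]
            resp.append((p[0] / s, p[1] / s))
        for k in (0, 1):          # M-step
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            w[k] = nk / len(data)
    return mu, w

data = ([random.gauss(-3.0, 1.0) for _ in range(200)]
        + [random.gauss(3.0, 1.0) for _ in range(200)])
mu, w = em_gmm_1d(data)   # mu recovers roughly (-3, 3), w roughly (0.5, 0.5)
```

Logging the log-likelihood per iteration shows the monotone increase, which sets up the lower-bound view in the next lecture.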



L16, Week 8, Tom: Variational inference
  Bounds
  Small example
  Mean-field recipe.
  Topic model/LDA again.



L17, Week 9, Tom: Variational examples
  Second mean-field example. Something involving transfer learning?
  Non-mean field example.
  Talk about approximation vs sampling - which choice is best.
  Details of the Kolmogorov-Smirnov test and calibrating distributions.



L18, Week 9, Tom: Markov random fields
  MRF vs CRF difference
  Semantic segmentation
  Solve with ICM, belief propagation (BP), Gibbs sampling, and variational methods.
  Graph cuts if it can be squeezed in.



L19, Week 10, Neill: Duality
  Duality.
  MRF Energy Minimisation and Beyond via Dual Decomposition
  Some other optimisation methods? e.g. L-BFGS.



L20, Week 10, Ken: Federated learning
  What it is
  Example(s)
  Would be nice to squeeze in branch and bound and random kernel machines (the first is probably tricky, but the second distributes well).



L21, Week 11, Tom: Causality
  Causality
  Graphical model inference
  Learning theory, though probably not too much.



L22, Week 12, Ken: Last words
  Overflow
  Revision
  How to find papers
  'Identify the right tool' exercise



<1 Week Revision>
<3 Week Assessment>



Lab 1: 12% (1 week)
Bayesian Sherlock - practising their Bayesian reasoning!


Lab 2: 12% (1 week)
Alien Invasion - Density estimation and some basic sampling.


Lab 3: 12% (1 week)
MCMC? Maybe some kind of natural language processing?


Lab 4: 24% (4 weeks, 2 over Easter)
A group project.
Rather like the idea of a problem with many parts and many possible solutions, where they can select any; that makes it easy to chop up!


Lab 5: 40% (4 weeks: 2 term weeks + revision week)
Choose paper from shortlist, present paper to group, implement and write up.

